Smart Paper Notes

Inference of Nonlinear Partial Differential Equations via Constrained Gaussian Processes

Zhaohui Li, Shihao Yang, Jeff Wu

Category: Machine Learning (Statistics)

2022-12-22

Partial differential equations (PDEs) are widely used to describe physical and engineering phenomena. Some key parameters in PDEs, which represent physical properties with important scientific interpretations, are difficult or even impossible to measure directly. Estimating these parameters from noisy and sparse experimental data of related physical quantities is an important task. Many methods for PDE parameter inference require a large number of evaluations of the numerical PDE solution through algorithms such as the finite element method, which can be time-consuming, especially for nonlinear PDEs. In this paper, we propose a novel method for estimating unknown parameters in PDEs, called PDE-Informed Gaussian Process Inference (PIGPI). By modeling the PDE solution as a Gaussian process (GP), we derive the manifold constraints induced by the (linear) PDE structure such that, under the constraints, the GP satisfies the PDE. For nonlinear PDEs, we propose an augmentation method that transforms the nonlinear PDE into an equivalent PDE system that is linear in all derivatives and that PIGPI can handle. PIGPI can be applied to multi-dimensional PDE systems and PDE systems with unobserved components. The method completely bypasses the numerical PDE solver, thus achieving drastic savings in computation time, especially for nonlinear PDEs. Moreover, PIGPI provides uncertainty quantification for both the unknown parameters and the PDE solution. The proposed method is demonstrated on several application examples from different areas.


Supervised Homogeneity Fusion: a Combinatorial Approach

Wen Wang, Shihao Wu, Ziwei Zhu, Ling Zhou, Peter X.-K. Song

Categories: Machine Learning (Statistics) | Machine Learning

2022-01-04

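Fusing regression coefficients into homogeneous groups can reveal coefficients that share a common value within each group. Such groupwise homogeneity reduces the intrinsic dimension of the parameter space and yields sharper statistical accuracy. We propose and investigate a new combinatorial grouping approach called $L_0$-Fusion that is amenable to mixed integer optimization (MIO). On the statistical side, we identify a fundamental quantity called grouping sensitivity that underpins the difficulty of recovering the true groups. We show that $L_0$-Fusion achieves grouping consistency under the weakest possible requirement on the grouping sensitivity: if this requirement is violated, the minimax risk of group misspecification fails to converge to zero. Moreover, we show that in the high-dimensional regime, one can apply $L_0$-Fusion coupled with a sure screening set of features without any essential loss of statistical efficiency while reducing the computational cost. On the algorithmic side, we provide an MIO formulation for $L_0$-Fusion along with a warm start strategy. Simulation and real data analysis demonstrate that $L_0$-Fusion outperforms its competitors in terms of grouping accuracy.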
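As a rough illustration of why grouping is a combinatorial problem, the sketch below fuses a handful of coefficient estimates into k groups by exhaustive search over contiguous splits of the sorted values. It is a toy stand-in under a plain squared-error criterion, not the authors' $L_0$-Fusion estimator or its MIO formulation; the function name `fuse_coefficients` and the example values are hypothetical.

```python
from itertools import combinations


def fuse_coefficients(coefs, k):
    """Split coefficients into k groups minimizing within-group squared error.

    Searches every contiguous split of the sorted values, which is enough to
    find the best partition for this one-dimensional squared-error criterion.
    Returns (labels, fused): labels[i] is the group index of coefs[i] and
    fused[i] is the mean of that group.
    """
    n = len(coefs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(coefs)")
    order = sorted(range(n), key=lambda i: coefs[i])
    values = [coefs[i] for i in order]

    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    best_cost, best_bounds = float("inf"), None
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        cost = sum(sse(values[a:b]) for a, b in zip(bounds, bounds[1:]))
        if cost < best_cost:
            best_cost, best_bounds = cost, bounds

    labels, fused = [0] * n, [0.0] * n
    for group, (a, b) in enumerate(zip(best_bounds, best_bounds[1:])):
        mean = sum(values[a:b]) / (b - a)
        for idx in order[a:b]:
            labels[idx], fused[idx] = group, mean
    return labels, fused


# Hypothetical noisy estimates of coefficients drawn from two true groups.
labels, fused = fuse_coefficients([1.02, -0.48, 0.97, -0.52, 1.01], k=2)
```

The number of candidate splits grows combinatorially with the number of coefficients, which is the kind of search that a mixed integer solver with a warm start is meant to make tractable.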


BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics

Liang Ma, Shuyang Cao, Robert L. Logan IV, Di Lu, Shihao Ran, Ke Zhang, Joel Tetreault, Aoife Cahill, Alejandro Jaimes

Category: Natural Language Processing

2022-12-20

The proliferation of automatic faithfulness metrics for summarization has produced a need for benchmarks to evaluate them. While existing benchmarks measure the correlation with human judgements of faithfulness on model-generated summaries, they are insufficient for diagnosing whether metrics are: 1) consistent, i.e., decrease as errors are introduced into a summary, 2) effective on human-written texts, and 3) sensitive to different error types (as summaries can contain multiple errors). To address these needs, we present a benchmark of unfaithful minimal pairs (BUMP), a dataset of 889 human-written, minimally different summary pairs, where a single error (from an ontology of 7 types) is introduced to a summary from the CNN/DailyMail dataset to produce an unfaithful summary. We find BUMP complements existing benchmarks in a number of ways: 1) the summaries in BUMP are harder to discriminate and less probable under SOTA summarization models, 2) BUMP enables measuring the consistency of metrics, and reveals that the most discriminative metrics tend not to be the most consistent, 3) BUMP enables the measurement of metrics' performance on individual error types and highlights areas of weakness for future work.
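One way to read the consistency criterion is as the share of minimal pairs on which a metric scores the faithful summary strictly higher than its unfaithful counterpart. The sketch below computes that share for a toy token-overlap metric; the metric, the function names, and the three example pairs are illustrative assumptions and are not taken from BUMP.

```python
def token_overlap(source, summary):
    """Toy faithfulness score: fraction of summary tokens found in the source."""
    source_tokens = set(source.lower().split())
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        return 0.0
    return sum(t in source_tokens for t in summary_tokens) / len(summary_tokens)


def consistency(metric, examples):
    """Share of pairs where the faithful summary scores strictly higher."""
    wins = sum(
        metric(src, good) > metric(src, bad) for src, good, bad in examples
    )
    return wins / len(examples)


# Each tuple is (source, faithful summary, minimally edited unfaithful summary).
examples = [
    ("the mayor opened the new bridge on monday",
     "the mayor opened the bridge",
     "the governor opened the bridge"),
    ("the team won three games in a row",
     "the team won three games",
     "the team lost three games"),
    ("alice paid bob for the tickets",
     "alice paid bob",
     "bob paid alice"),
]
score = consistency(token_overlap, examples)
```

The third pair swaps two entities without introducing any new token, so a bag-of-words metric cannot separate the two summaries. Breaking results down by error type exposes this sort of blind spot.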


Focal-PETR: Embracing Foreground for Efficient Multi-Camera 3D Object Detection

Shihao Wang, Xiaohui Jiang, Ying Li

Category: Computer Vision

2022-12-11

The dominant multi-camera 3D detection paradigm is based on explicit 3D feature construction, which requires complicated indexing of local image-view features via 3D-to-2D projection. Other methods implicitly introduce geometric positional encoding and perform global attention (e.g., PETR) to build the relationship between image tokens and 3D objects. The 3D-to-2D perspective inconsistency and global attention lead to a weak correlation between foreground tokens and queries, resulting in slow convergence. We propose Focal-PETR, which uses instance-guided supervision and a spatial alignment module to adaptively focus object queries on discriminative foreground regions. Focal-PETR additionally introduces a down-sampling strategy to reduce the cost of global attention. Owing to the highly parallelized implementation and the down-sampling strategy, our model, without depth supervision, achieves leading performance on the large-scale nuScenes benchmark and a superior speed of 30 FPS on a single RTX3090 GPU. Extensive experiments show that our method outperforms PETR while consuming 3x fewer training hours. The code will be made publicly available.


Accounting for Temporal Variability in Functional Magnetic Resonance Imaging Improves Prediction of Intelligence

Yang Li, Xin Ma, Raj Sunderraman, Shihao Ji, Suprateek Kundu

Category: Machine Learning

2022-11-11

Neuroimaging-based prediction methods for intelligence and cognitive abilities have developed rapidly in the literature. Among different neuroimaging modalities, prediction based on functional connectivity (FC) has shown great promise. Most of the literature has focused on prediction using static FC, but there are limited investigations of the merits of such analysis compared to prediction based on dynamic FC or region level functional magnetic resonance imaging (fMRI) time series that encode temporal variability. To account for the temporal dynamics in fMRI data, we propose a deep neural network built on a bi-directional long short-term memory (bi-LSTM) approach that also incorporates a feature selection mechanism. The proposed pipeline is implemented via an efficient GPU computation framework and applied to predict intelligence scores based on region level fMRI time series as well as dynamic FC. We compare the prediction performance for different intelligence measures based on static FC, dynamic FC, and region level time series acquired from the Adolescent Brain Cognitive Development (ABCD) study involving close to 7000 individuals. Our detailed analysis illustrates that static FC consistently has inferior prediction performance compared to region level time series or dynamic FC for unimodal rest and task fMRI experiments, and in almost all cases using a combination of task and rest features. In addition, the proposed bi-LSTM pipeline based on region level time series identifies several shared and differential important brain regions across task and rest fMRI experiments that drive intelligence prediction. A test-retest analysis of the selected features shows strong reliability across cross-validation folds. Given the large sample size of the ABCD study, our results provide strong evidence that superior prediction of intelligence can be achieved by accounting for temporal variations in fMRI.
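The difference between static and dynamic FC can be seen on a synthetic pair of signals. The sketch below assumes the common sliding-window correlation as the dynamic FC estimate, which may differ from the paper's exact construction; the signals, window length, and function names are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def sliding_window_fc(x, y, window, step):
    """Correlation inside each window: one simple dynamic FC estimate."""
    return [
        pearson(x[s:s + window], y[s:s + window])
        for s in range(0, len(x) - window + 1, step)
    ]


# Synthetic regions: B follows A in the first half and mirrors it in the second.
region_a = [math.sin(0.5 * t) for t in range(80)]
region_b = [a if t < 40 else -a for t, a in enumerate(region_a)]

static_fc = pearson(region_a, region_b)  # one number for the whole scan
dynamic_fc = sliding_window_fc(region_a, region_b, window=40, step=40)
```

Here the static estimate is close to zero because the two halves cancel, while the windowed estimates are close to +1 and -1. Temporal structure of this kind is invisible to static FC.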


Learning Visual Representation of Underwater Acoustic Imagery Using Transformer-Based Style Transfer Method

Xiaoteng Zhou, Changli Yu, Shihao Yuan, Xin Yuan, Hangchi Yu, Citong Luo

Category: Computer Vision

2022-11-10

Underwater automatic target recognition (UATR) has been a challenging research topic in ocean engineering. Although deep learning brings opportunities for target recognition on land and in the air, underwater target recognition techniques based on deep learning have lagged due to sensor performance and the limited amount of trainable data. This letter proposes a framework for learning the visual representation of underwater acoustic imagery, built around a transformer-based style transfer model. It replaces the low-level texture features of optical images with the visual features of underwater acoustic imagery while preserving their original high-level semantic content. The proposed framework can make full use of rich optical image datasets to generate a pseudo-acoustic image dataset and use it as the initial sample to train an underwater acoustic target recognition model. The experiments use dual-frequency identification sonar (DIDSON) as the underwater acoustic data source and take fish, the most common marine creature, as the research subject. Experimental results show that the proposed method can generate high-quality, high-fidelity pseudo-acoustic samples, achieve acoustic data enhancement, and support research on domain transfer between underwater acoustic and optical images.


Deep Convolutional Neural Network and Transfer Learning for Locomotion Intent Prediction

Duong Le, Shihao Cheng, Robert D. Gregg, Maani Ghaffari

Category: Robotics

2022-09-26

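Powered prosthetic legs must anticipate the user's intent when switching between different locomotion modes (e.g., stair ascent/descent, ramp ascent/descent). Many data-driven classification techniques have shown promising results for predicting user intent, but the performance of these intent prediction models on new subjects remains unsatisfactory. In other domains (e.g., image classification), transfer learning has improved accuracy by reusing features previously learned from a large dataset (i.e., a pre-trained model) and then transferring the learned model to a new task. In this paper, we develop a deep convolutional neural network with intra-subject (subject-dependent) and inter-subject (subject-independent) validation based on a human locomotion dataset. We then apply transfer learning to the subject-independent model using a small portion (10%) of the data from the left-out subject. We compare the performance of these three models. Our results show that the transfer learning (TL) model outperforms the subject-independent (IND) model and is comparable to the subject-dependent (DEP) model (DEP error: 0.74 $\pm$ 0.002%, IND error: 11.59 $\pm$ 0.076%, TL error: 3.57 $\pm$ 0.02% with 10% of the data). Moreover, as expected, transfer learning accuracy increases as more data from the left-out subject become available. We also evaluate the performance of the intent prediction system under various sensor configurations that may be available in a prosthetic leg application. Our results suggest that a thigh IMU on the prosthesis is sufficient to predict locomotion intent in practice.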


DytanVO: Joint Refinement of Visual Odometry and Motion Segmentation in Dynamic Environments

Shihao Shen, Yilin Cai, Wenshan Wang, Sebastian Scherer

Categories: Computer Vision | Robotics

2022-09-17

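Learning-based visual odometry (VO) algorithms achieve remarkable performance on common static scenes, benefiting from high-capacity models and large amounts of annotated data, but they tend to fail in dynamic, populated environments. Semantic segmentation is mainly used to discard dynamic associations before estimating camera motion, but this comes at the cost of discarding static features and is hard to scale to unseen categories. In this paper, we leverage the mutual dependence between camera ego-motion and motion segmentation and show that both can be jointly refined in a single learning-based framework. In particular, we present DytanVO, the first learning-based VO method that deals with dynamic environments. It takes two consecutive monocular frames in real time and predicts camera ego-motion in an iterative fashion. Our method achieves an average improvement of 27.7% over state-of-the-art VO solutions in real-world dynamic environments, and even performs competitively among dynamic visual SLAM systems that optimize the trajectory on the backend. Experiments on many unseen environments also demonstrate the generalizability of our method.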


Towards Bridging the Performance Gaps of Joint Energy-based Models

Xiulong Yang, Qing Su, Shihao Ji

Category: Computer Vision

2022-09-16

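Can we train a hybrid discriminative-generative model within a single network? This question has recently been answered in the affirmative, introducing the field of Joint Energy-based Models (JEM), which achieve high classification accuracy and image generation quality simultaneously. Despite recent advances, two performance gaps remain: the accuracy gap to the standard softmax classifier, and the generation quality gap to state-of-the-art generative models. In this paper, we introduce a variety of training techniques to bridge the accuracy gap and the generation quality gap of JEM. 1) We incorporate the recently proposed sharpness-aware minimization (SAM) framework to train JEM, which promotes the smoothness of the energy landscape and the generalizability of JEM. 2) We exclude data augmentation from the maximum likelihood estimation pipeline of JEM, and mitigate the negative impact of data augmentation on image generation quality. Extensive experiments on multiple datasets demonstrate that our SADA-JEM achieves state-of-the-art performance and outperforms JEM in image classification, image generation, calibration, out-of-distribution detection, and adversarial robustness.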
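For background, the JEM literature reinterprets a classifier's logits f(x) as defining both p(y|x) = softmax(f(x)) and an energy E(x) = -logsumexp(f(x)) over inputs. That formulation is not spelled out in the abstract above, so the sketch below should be read as a minimal illustration under that assumption; the logit values are made up, and no training, SAM step, or sampling is shown.

```python
import math


def logsumexp(logits):
    """Numerically stable log(sum(exp(z))) over a list of logits."""
    m = max(logits)
    return m + math.log(sum(math.exp(z - m) for z in logits))


def class_probabilities(logits):
    """Discriminative head: p(y | x) as a softmax over the logits."""
    lse = logsumexp(logits)
    return [math.exp(z - lse) for z in logits]


def energy(logits):
    """Generative head: E(x) = -logsumexp(f(x)); lower energy, higher density."""
    return -logsumexp(logits)


logits = [2.0, 0.5, -1.0]            # hypothetical network output for one input
shifted = [z + 1.0 for z in logits]  # same logits plus a constant

probs = class_probabilities(logits)
probs_shifted = class_probabilities(shifted)
energy_gap = energy(logits) - energy(shifted)
```

Adding a constant to every logit leaves the class probabilities unchanged but lowers the energy by that constant. This is the extra degree of freedom that lets one network act as both a classifier and an unnormalized density model.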


Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model

Xiulong Yang, Sheng-Min Shih, Yinlin Fu, Xiaoting Zhao, Shihao Ji

Category: Computer Vision

2022-08-16

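Denoising diffusion probabilistic models (DDPM) and vision transformers (ViT) have shown significant progress in generative and discriminative tasks, respectively, and so far these models have largely been developed in their own domains. In this paper, we establish a direct connection between DDPM and ViT by integrating the ViT architecture into DDPM, and introduce a new generative model called Generative ViT (GenViT). The modeling flexibility of ViT enables us to further extend GenViT to hybrid discriminative-generative modeling and introduce a Hybrid ViT (HybViT). Our work is among the first to explore a single ViT for image generation and classification jointly. We conduct a series of experiments to analyze the performance of the proposed models and demonstrate that they surpass the prior state of the art in both generative and discriminative tasks. Our code and pre-trained models can be found at https://github.com/sndnyang/diffusion_vit.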
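For readers unfamiliar with DDPM, the forward (noising) process has a closed form: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise, where alpha_bar_t is the cumulative product of (1 - beta_t). The sketch below illustrates only that step, assuming the linear beta schedule commonly used in the DDPM literature; the schedule values and the toy input are assumptions, and nothing here reproduces GenViT or HybViT.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    alpha_bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def diffuse(x0, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar) * x0, (1 - alpha_bar) * I)."""
    scale, sigma = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)             # fixed seed so the sketch is repeatable
alpha_bars = alpha_bar_schedule(1000)
x0 = [0.5, -0.25, 1.0]             # toy stand-in for image pixels
x_early = diffuse(x0, alpha_bars[0], rng)   # almost the clean input
x_late = diffuse(x0, alpha_bars[-1], rng)   # almost pure Gaussian noise
```

A denoising network, which in the paper is a ViT, is trained to undo this corruption one step at a time.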
